Nearly Optimal Semi-Supervised Learning on Subgraphs

Authors

  • Gayatree Ganu
  • Branislav Kveton

Abstract

The harmonic solution (HS) on a graph is one of the most popular approaches to semi-supervised learning. This is the first paper that studies how to identify highly confident HS predictions on a graph based on the HS on its subgraph. The premise of our method is that the subgraph is much smaller than the graph and therefore the most confident predictions can be identified much faster than computing the HS on the graph. We introduce a class of subgraphs that allow for good approximations, prove bounds on the difference in the HS on the graph and its subgraph, and propose an efficient approach to building the subgraphs. Our solution is evaluated in the domains of handwritten digit recognition, and topic discovery in restaurant and hotel reviews. In all cases, we show that only a small portion of the graph is sufficient to identify highly confident predictions.
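
For readers unfamiliar with the harmonic solution, the following is a minimal sketch of the textbook, unregularized formulation on a full graph. It is not necessarily the exact variant used by the authors and does not implement their subgraph construction. The function name `harmonic_solution`, the toy path graph, and the confidence threshold of 0.5 are all assumptions made for illustration. With graph Laplacian L = D - W and labels y on the labeled nodes, the predictions on the unlabeled nodes solve L_uu f_u = -L_ul y, and |f_u| can be read as a confidence.

```python
import numpy as np


def harmonic_solution(W, labeled_idx, labels):
    """Textbook harmonic solution on a weighted undirected graph.

    W           : (n, n) symmetric nonnegative weight matrix
    labeled_idx : indices of labeled nodes
    labels      : labels of those nodes, here in {-1, +1}

    Assumes every unlabeled node is connected to at least one labeled
    node, otherwise the linear system below is singular.
    """
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx, dtype=int)
    labels = np.asarray(labels, dtype=float)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)

    # Combinatorial graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    L_ul = L[np.ix_(unlabeled_idx, labeled_idx)]

    # Solve L_uu f_u = -L_ul y for the unlabeled nodes.
    f_u = np.linalg.solve(L_uu, -L_ul @ labels)

    f = np.empty(n)
    f[labeled_idx] = labels
    f[unlabeled_idx] = f_u
    return f


# Toy example (hypothetical data): a path graph 0 - 1 - 2 - 3
# with node 0 labeled +1 and node 3 labeled -1.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

f = harmonic_solution(W, labeled_idx=[0, 3], labels=[1.0, -1.0])

# Treat |f_i| as confidence; the 0.5 threshold is an arbitrary choice.
confident_nodes = np.flatnonzero(np.abs(f) >= 0.5)
```

On this path graph the solution interpolates linearly between the two labeled endpoints, so the interior nodes receive low-magnitude (low-confidence) predictions. Solving the system on the full graph is the cost that the paper aims to avoid by working on a much smaller subgraph.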

Similar resources

Efficient semi-supervised learning on locally informative multiple graphs

We address an issue of semi-supervised learning on multiple graphs, over which informative subgraphs are distributed. One application under this setting can be found in molecular biology, where different types of gene networks are generated depending upon experiments. Here an important problem is to annotate unknown genes by using functionally known genes, which connect to unknown genes in gene...

Graph Partition Neural Networks for Semi-Supervised Classification

We present graph partition neural networks (GPNN), an extension of graph neural networks (GNNs) able to handle extremely large graphs. GPNNs alternate between locally propagating information between nodes in small subgraphs and globally propagating information between the subgraphs. To efficiently partition graphs, we experiment with several partitioning algorithms and also propose a novel vari...

Semisupervised learning using feature selection based on maximum density subgraphs

We present a new graph based semi-supervised learning algorithm, using multiway cut on a neighborhood graph to achieve an optimum classification. We also present a graph based feature selection algorithm utilizing the global structure of the graph derived from both labeled and unlabeled examples. With respect to the experiments we conducted, both of our approaches are proved to have a promising...

Data dependent kernels in nearly-linear time

We propose a method to efficiently construct data-dependent kernels which can make use of large quantities of (unlabeled) data. Our construction makes an approximation in the standard construction of semi-supervised kernels in Sindhwani et al. (2005). In typical cases these kernels can be computed in nearly-linear time (in the amount of data), improving on the cubic time of the standard constru...

Semi-supervised learning of hierarchical representations of molecules using neural message passing

With the rapid increase of compound databases available in medicinal and material science, there is a growing need for learning representations of molecules in a semi-supervised manner. In this paper, we propose an unsupervised hierarchical feature extraction algorithm for molecules (or more generally, graph-structured objects with fixed number of types of nodes and edges), which is applicable ...

Publication date: 2013